SEMANTIC EMPHASIS FOR TEXT GENERATION






Elke Teich, Beate Firzlaff, John A. Bateman*

GMD/Institut für Integrierte Publikations- und Informationssysteme
Dolivostraße 15, D-64293 Darmstadt, Germany

e-mail: {teich, firzlaff, bateman}@darmstadt.gmd.de
May 1994 



Paper presented at COLING '94 (Kyoto, 
Japan) but not to be found in the printed 
proceedings. 

Cmp-lg No: cmp-lg/9704012. 

Abstract 

This paper addresses a problem shared by many approaches to text
generation and text planning: they make only limited formally
specifiable contact with accounts of grammar. We propose enhancing
a systemically-based generation architecture for German
[Bateman et al., 1991] with aspects of the theory of semantic
emphasis [Kunze, 1991]. In doing so, we gain more control over both
concept selection in generation and the choice of fine-grained
grammatical variation.

1 INTRODUCTION 

The extension of linguistic representation 
to levels of abstraction above syntax is 
an important theoretical goal; current ef- 
forts in this direction include [Alshawi, 1992, 
Grover et al, 1993, Jackendoff, 1990]. How- 
ever, problematic with most of these de- 
velopments is their restriction to, as it 
is termed in systemic-functional theory 
[Halliday, 1978], ideational information. As 
[Grover et al., 1993] report, diverse sentences
such as He found it in the park and It
was in the park that he found it are assigned 
identical semantic representations by the 
Alvey grammar, and it is common for varia- 
tions such as these to be relegated to 'prag- 
matic' interpretations of invariant semantic 
forms. We claim that such variation also re- 
quires a semantic representation based on a 
textual semantics that augments the exist- 
ing ideational semantics. The importance 
of such a broadening of semantic representa- 
tions is already clear in work on text genera- 
tion [Horacek and Zock, 1993, Meteer, 1991, 
Bateman et al, 1993], and has been argued 
also for analysis [Matthiessen et al, 1991]. 
Unfortunately, accounts of text organiza- 
tion and text planning achieved within 
text generation often make only limited 
formally specifiable contact with accounts 
of grammar (e.g., [Grosz and Sidner, 1986, 
Mann and Thompson, 1987, Hovy, 1987, 
Hovy et al, 1992]); and contrariwise, for- 
mal accounts such as discourse representa- 
tion theory, although beginning to make con- 
tact with higher levels of rhetorical organiza- 
tion (e.g., [Lascarides and Asher, 1991]), are 
typically restricted to describing anaphoric 
relations and quantification. In this paper,
we present one particular extension to
a textual semantics, showing its integration
and use in dealing with some related problems
in text generation. The extension is
based on the semantic analysis of the structure
of lexical fields in German developed by
Kunze (e.g., [Kunze, 1991]), combining this
with the systemic-functionally driven mode
of text generation pursued in the text generation
project KOMET [Bateman et al., 1991].
A computational representation of the semantic
information posited by Kunze's theory of semantic
emphasis (Theorie der semantischen Emphase)
is under development for the lexical
entries of the text analysis project KONTEXT
[Firzlaff and Haenelt, 1992]. We describe
here how this information is now used
as an additional source of functional con- 
straints during grammatical decision mak- 
ing and how this allows a natural contact 
with certain text organization decisions. We 
briefly illustrate the work with two exam- 
ples: first, our approach to a central problem 
in knowledge-based natural language pro- 
cessing, that of how to relate domain models 
to levels of linguistic knowledge and process- 
ing; and second, a demonstration of the co- 
constraints between emphasis distribution 
and certain textual decisions. Finally, we 
discuss the directions that this work now 
opens up for future investigation, includ- 
ing application of NLP components to real- 
world domains [Teich et al, 1994] and gen- 
eralizations to languages other than German 
(cf. [Kunze, 1992]). 

2 EMPHASIS THEORY 

The theory of semantic emphasis [Kunze, 1991] proposes
explanations concerning the meaning of situation descriptions
communicated by natural language texts and its relationship to
possible syntactic realizations. One aspect of the theory will be
outlined here, namely how a syntactic realization depends on
semantics. Moreover, it will be shown how grammatical features can
be derived systematically.

The theory provides prototypical descriptions of situations. These
descriptions are called basic semantic schemes. They are given in
terms of predicate-argument structures called propositions. For
instance, the basic semantic scheme for situations of
change-of-possession is:

    ( cause ( act (a)
              et ( bec ( have (a1, a2) )
                   bec ( not ( have (a3, a4) ) ) ) ) )

This can be paraphrased as: An action of a causes a1 to get a2 and
a3 to lose a4.^1 Since this description is prototypical it provides
just one transferred object: it is denoted by both a2 and a4 because
it can be regarded from different points of view. Furthermore,
ref(a) might either be the same as ref(a1) or as ref(a3), but
ref(a1) and ref(a3) must be different.
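
As an illustration only, the scheme can be written down as a nested
data structure. The following Python sketch is not part of the theory
of semantic emphasis or of KOMET; the class name `Prop` and the
helpers `elementary_arguments` and `referents_ok` are hypothetical
names introduced here. It encodes only what the text states: the
propositional structure, the single transferred object, and the
requirement that ref(a1) and ref(a3) differ.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Prop:
    """A proposition: a predicate applied to variables or propositions."""
    pred: str
    args: Tuple[Union["Prop", str], ...]


# Basic semantic scheme for change-of-possession, as given in the text.
CHANGE_OF_POSSESSION = Prop("cause", (
    Prop("act", ("a",)),
    Prop("et", (
        Prop("bec", (Prop("have", ("a1", "a2")),)),
        Prop("bec", (Prop("not", (Prop("have", ("a3", "a4")),)),)),
    )),
))


def elementary_arguments(prop):
    """Collect the variables of a scheme, left to right."""
    found = []
    for arg in prop.args:
        if isinstance(arg, Prop):
            found.extend(elementary_arguments(arg))
        else:
            found.append(arg)
    return found


def referents_ok(ref):
    """Check the two referential constraints stated in the text.

    ref maps variable names to referent names (hypothetical encoding).
    """
    one_transferred_object = ref["a2"] == ref["a4"]
    giver_differs_from_receiver = ref["a1"] != ref["a3"]
    return one_transferred_object and giver_differs_from_receiver
```

For "He sends him an invitation", a mapping such as
`{"a": "he", "a1": "him", "a2": "invitation", "a3": "he", "a4": "invitation"}`
satisfies `referents_ok`.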

So, each participant of a situation may 
be referred to more than once in the cor- 
responding description. Each of these ref- 
erences corresponds to a specific role (deep 
case) which, in turn, mirrors a specific point 
of view towards the referent. The roles are 
derived systematically rather than being de- 
termined in a more or less intuitive way (e.g., 
[Fillmore, 1968]): they are derived accord- 
ing to a set of well-defined recursive rules 
(cf. [Kunze, 1991, pp. 78-89]); the derivation
process follows the propositional structure 
bottom-up. A basic predicate has at least 
one elementary argument (represented by 
some variable) to which an initial role value 
is assigned. The other predicates only take 
propositional arguments and modify the (ini- 
tial or intermediate) role values assigned to 
the elementary arguments. For instance, the
basic predicate have assigns the role <locat, have> to its first
argument as initial value. The predicate bec further specifies
<locat, have> as <goal, have>. The predicate et, on the other hand,
never changes a role. The predicate cause does not affect a
<goal, have> in its second argument.

^1 The variables are to be filled in by the names of the referents.

The roles derived for the basic semantic 
scheme in question are: 

    a     <agens, act>
    a1    <goal, have>
    a2    <to-obj, have>
    a3    <source, have>
    a4    <from-obj, have>

The second column of this table is the 
maximum case frame of verbs that can be 
used to describe a change-of-possession in
which one object is transferred. However, 
in a phrase that describes a situation only 
some roles of the maximum case frame are 
verbalized. Roles that are not verbalized are 
said to be blocked. 
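
The bottom-up derivation of the roles listed above can be sketched in
Python. This is a hedged illustration, not Kunze's rule set: only four
rule fragments come from the text (have gives its first argument the
initial value locat, bec turns locat into goal, et changes nothing, and
cause leaves <goal, have> untouched). Every other table entry, and the
names `INITIAL`, `MODIFY` and `derive`, are assumptions chosen solely
so that the result reproduces the role table for change-of-possession.

```python
# Initial role values assigned by basic predicates, one per argument.
INITIAL = {
    "have": ["locat", "obj"],   # "locat" is from the text; "obj" is ASSUMED
    "act": ["agens"],           # ASSUMED to be an initial value
}

# How the other predicates rewrite role values passed up from below.
MODIFY = {
    "bec": {
        "locat": "goal",            # from the text
        "obj": "to-obj",            # ASSUMED
        "not-locat": "source",      # ASSUMED
        "not-obj": "from-obj",      # ASSUMED
    },
    "not": {"locat": "not-locat", "obj": "not-obj"},  # ASSUMED
    "et": {},      # from the text: never changes a role
    "cause": {},   # leaves the role values occurring here unchanged
}


def derive(prop):
    """Derive roles bottom-up for a proposition given as a nested tuple
    (predicate, argument, ...). Returns {variable: (value, predicate)}."""
    pred, *args = prop
    if pred in INITIAL:
        # Basic predicate: assign an initial value to each variable.
        return {var: (value, pred) for var, value in zip(args, INITIAL[pred])}
    roles = {}
    for sub in args:
        # Non-basic predicate: modify the values derived for its arguments.
        for var, (value, basic) in derive(sub).items():
            roles[var] = (MODIFY[pred].get(value, value), basic)
    return roles


SCHEME = ("cause",
          ("act", "a"),
          ("et",
           ("bec", ("have", "a1", "a2")),
           ("bec", ("not", ("have", "a3", "a4")))))

ROLES = derive(SCHEME)
```

Under these assumptions `ROLES["a3"]` is `("source", "have")`, matching
the table; a different choice of the assumed entries would be equally
compatible with the text.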

Moreover, some aspect of a situation is put into the foreground,
which means that in a suitable phrase the corresponding role(s) is
(are) verbalized with semantic emphasis. In terms of the theory of
semantic emphasis this is reflected by the parameter of emphasis,
which is assigned to partial propositions of a basic semantic
scheme. These assignments are the result of a rule-based
distribution. A basic semantic scheme specifies where to start the
distribution. For change-of-possession, the starting point is the
second argument of the predicate cause, namely the proposition with
the predicate et. Therefore the et-proposition has emphasis. A
proposition that has emphasis distributes it top-down to one of its
arguments. Accordingly, a proposition that has emphasis can only be
the argument of a proposition that also has emphasis. Consequently,
one of the have-propositions has emphasis, and the act-proposition
may have emphasis.
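
The distributions this paragraph allows for change-of-possession can
be enumerated in a few lines. The sketch below is only a reading of
the paragraph, not an implementation of the full rule system; the
labels and the function name `emphasis_distributions` are
hypothetical.

```python
from itertools import product

# The two have-propositions of the scheme, labelled by their variables.
HAVE_PROPOSITIONS = ("have(a1, a2)", "have(a3, a4)")


def emphasis_distributions():
    """Enumerate the distributions described in the text: the
    et-proposition always has emphasis and passes it to exactly one of
    the have-propositions; the act-proposition may or may not have it."""
    distributions = []
    for emphasized_have, act_has_emphasis in product(HAVE_PROPOSITIONS,
                                                     (False, True)):
        distributions.append({
            "et": True,
            "have": emphasized_have,
            "act": act_has_emphasis,
        })
    return distributions
```

On this reading there are four candidate distributions; which of them
a given verb admits depends on further constraints not covered here.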

According to a general rule, at least one of the roles of a
proposition with emphasis must not be blocked, which means that it
has to be verbalized. Furthermore, its grammatical realization must
be in nominative, genitive, dative or accusative case. The choice of
the grammatical case depends mainly on the role. Secondly, it is
determined by the subset of roles that are not blocked and belong to
propositions with emphasis. On the other hand, the roles of
propositions without emphasis need not be verbalized at all, but if
one of them is verbalized, its grammatical realization can only be
by oblique case, i.e. by a prepositional object. The choice of
suitable prepositions depends on the role.^2
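
These two rules lend themselves to a small consistency check. The
following Python sketch is an illustration under stated assumptions:
the function name `check_proposition` and the encoding (a case name,
the string "oblique" for a prepositional object, or `None` for a
blocked role) are hypothetical and not taken from the theory.

```python
NON_OBLIQUE = {"nominative", "genitive", "dative", "accusative"}


def check_proposition(has_emphasis, realizations):
    """Check one proposition against the two rules stated in the text.

    realizations maps each role of the proposition to a grammatical
    case, to "oblique" for a prepositional object, or to None if the
    role is blocked. Returns a list of violations (empty if none).
    """
    problems = []
    verbalized = {r: c for r, c in realizations.items() if c is not None}
    if has_emphasis:
        # At least one role of an emphasized proposition is verbalized.
        if not verbalized:
            problems.append("emphasis but every role is blocked")
        # Verbalized roles with emphasis take a non-oblique case.
        for role, case in verbalized.items():
            if case not in NON_OBLIQUE:
                problems.append(f"{role}: emphasized role needs a non-oblique case")
    else:
        # Roles without emphasis may be blocked; otherwise oblique only.
        for role, case in verbalized.items():
            if case != "oblique":
                problems.append(f"{role}: role without emphasis must be oblique")
    return problems
```

For instance, `check_proposition(False, {"goal": "oblique", "to-obj": None})`
returns an empty list, whereas giving the goal dative case in a
proposition without emphasis is reported as a violation.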

In Figure 1 we present some sample sentences.
Their propositional descriptions are
derived from the basic semantic scheme of
change-of-possession. Note that if we add
emphasis information and select the roles 
that are to be verbalized, we construct the 
semantic forms derivable from the basic se- 
mantic scheme. (In the figure, " — " indi- 
cates that the corresponding role is blocked, 
i.e. no grammatical case is assigned to it. 
Propositions with basic predicates that have 
emphasis and the corresponding information 
concerning the grammatical realizations are 
in bold face.) 

For reasons of illustration we have chosen
a specific lexical field. However,
the principles of the theory of semantic
emphasis also apply to other fields
[Kunze and Firzlaff, 1993], e.g.,
change-of-location, creation, measuring, verba dicendi.
For each of these fields the theory provides a 
prototypical description (i.e. a basic seman- 
tic scheme) to which the rules we presented 
here and other constraints must be applied 
in order to derive a specification of possible 
grammatical realizations. 



^2 The theory of semantic emphasis has been worked out for German.
However, most of its principles apply to other languages as well.
(1) Sie verliert den Schlüssel. (She loses the key.)
( cause ( act ( <agens, act>: ref(she) ... [ — ] )
          et ( bec ( have ( <goal, have>: a1, ... [ — ]
                            <to-obj, have>: ref(key) ... [ — ] ))
               bec ( not ( have ( <source, have>: ref(she), ... [ nominative ]
                                  <from-obj, have>: ref(key) ... [ accusative ] ))))))

(2) Sie wirft den Schlüssel weg. (She throws away the key.)
( cause ( act ( <agens, act>: ref(she) ... [ nominative ] )
          et ( bec ( have ( <goal, have>: a1, ... [ — ]
                            <to-obj, have>: ref(key) ... [ — ] ))
               bec ( not ( have ( <source, have>: ref(she), ... [ — ]
                                  <from-obj, have>: ref(key) ... [ accusative ] ))))))

(3) Er schickt ihm eine Einladung. (He sends him an invitation.)
( cause ( act ( <agens, act>: ref(he) ... [ nominative ] )
          et ( bec ( have ( <goal, have>: ref(him), ... [ dative ]
                            <to-obj, have>: ref(invitation) ... [ accusative ] ))
               bec ( not ( have ( <source, have>: ref(he), ... [ — ]
                                  <from-obj, have>: ref(invitation) ... [ — ] ))))))

(4) Er schickt eine Einladung an ihn. (He sends an invitation to him.)
( cause ( act ( <agens, act>: ref(he) ... [ nominative ] )
          et ( bec ( have ( <goal, have>: ref(him), ... [ to-phrase ]
                            <to-obj, have>: ref(invitation) ... [ — ] ))
               bec ( not ( have ( <source, have>: ref(he), ... [ — ]
                                  <from-obj, have>: ref(invitation) ... [ accusative ] ))))))

Figure 1: Basic semantic forms of sample sentences

3 GENERATION ARCHITECTURE

The general architecture of the KOMET system has been described in
detail elsewhere [Bateman et al., 1991]; it follows very closely the
modularities entailed by the linguistic stratification assumed
within systemic-functional linguistics (e.g., [Halliday, 1978]). Of
most relevance here is the necessity of specifying the relationship
between an abstract grammatically-oriented semantics and an account
of the context of situation. This relationship underlies the main
reason for adopting a systemic-functional orientation in text
generation: grammatical and lexical decisions are related to the
deployment of communicative goals in their communicative context.
The project includes the development of a large systemic-functional
grammar of German [Teich, 1992] and the construction, on the basis
of the original Penman English Upper Model [Bateman et al., 1990],
of a revised Upper Model ontology that spans the semantic
requirements of both German and English [Henschel, 1993,
Henschel and Bateman, 1994]. Input to the grammar component is
expressed in the Penman Sentence Plan Language (SPL) [Kasper, 1989].
However, in contrast to Penman, where the SPL is largely equivalent
to A-Box assertions made against a T-Box component combining the
Upper Model and domain model, in KOMET we allocate the domain model,
which is external to the generation system, to the higher stratum of
context. This provides the theoretical space for the required
flexible mapping from domain model concepts to Upper Model concepts.
Context is organized into three areas [Halliday, 1978,
Martin, 1992], only one of which is relevant to us here: field (the
socially significant activities, participant-types and activity
sequences of the communicative context).

We relate the information of the theory 
of semantic emphasis to this architecture as 
follows. We adopt basic semantic schemes 
as abstract general characterizations of a 
subtype of the field of context; i.e., one of 
the contexts in which interlocutors can un- 
derstand themselves to be is classifiable ab- 
stractly as, for example, an exchange of (gen- 
eralized) possession. This is then related 
to the semantic classes available in the Up- 
per Model by means of realization: accord- 
ing to the distribution of semantic empha- 
sis over the basic semantic scheme, a par- 
ticular Upper Model concept is selected as 
appropriate together with a particular con- 
figuration of semantic roles. This semantic 
specification then forms the basis of possible 
SPL expressions that can be passed on to the 
lexicogrammar for expression. As described 
in [Matthiessen and Bateman, 1991, Chap- 
ter 9], the Upper Model is only one of three 
bodies of semantic information necessary for 
generation. Semantic emphasis distribution 
also shows co-constraints with decisions in 
another component, the Text Base, where
information concerning textual statuses such
as thematicity, given/newness, identifiability,
etc. is maintained for constraining those
grammatical decisions sensitive to such distinctions.
To these we add a component for
grammaticalized semantic emphasis, which
represents the distribution of semantic emphasis
that is visible from the grammar.^3 In
the examples below, we will represent such 
textual statuses as additional annotations 
present in the SPL semantic specifications; 
this is the normal way in which textual in- 
formation is captured in SPL. 



^3 This is entirely analogous to Jackendoff's [Jackendoff, 1983,
p. 404/5] view of 'argument structure' as an abbreviation for that
part of conceptual structure that is "visible from the syntax"; this
is simply extended systemically to include representations of
textual statuses.
4 EXAMPLES

Given that each semantic form both has a propositional content
(Sachverhaltsrepräsentation) and indicates, given a particular
emphasis distribution, a particular textual status of the
participants in the proposition, we elaborate two examples of how we
can make use of these two aspects in our generation architecture:

• Example 1: emphasis information providing grounds on which a
process type in the Upper Model can be chosen.

• Example 2: the mutual constraints between emphasis information and
textual statuses.

Example 1: Choice of Upper Model type

Since the introduction of the Penman Up- 
per Model (e.g., [Bateman et al, 1990]) in 
1985, interfacing with a generation system 
by means of an abstract linguistically mo- 
tivated 'ontology' has become widespread 
(cf., e.g., the ontologies of the LILOG sys- 
tem [Klose et al, 1992] and many others; 
see [Bateman, 1992] for extensive discus- 
sion). Although this is usually achieved 
by direct subordination of domain concepts 
to 'Upper Model' (or equivalent) concepts, 
this is known to be insufficient — domain 
concepts often need to change their Upper 
Model classification depending on their appearance
in particular texts and text organizations.
Here we illustrate how this general
problem of flexibly allocating domain concepts
to appropriate Upper Model concepts
can be partially solved by an application of
the semantic emphasis theory.

Our illustration of the control of semantic choice by emphasis
information is drawn from the field of change-of-possession. This
already constrains the possible choices of an Upper Model process
type for a proposition to be verbalized to action.^4 Without this
specification of field, choice between all four process types in the
Upper Model is completely open. As shown in Section 2, a field
specification of change-of-possession has the maximum case frame:
<agens, act>, <goal, have>, <to-obj, have>, <source, have> and
<from-obj, have>. As an example, we will consider the cases where
two or three of these five roles have been blocked according to
particular emphasis distributions. In such cases, a process type
action with an instantiation of two Upper Model roles must be
chosen. If the <agens, act> is blocked and the <source, have> and
the <from-obj, have> have emphasis and are not blocked, or if both
the <agens, act> and the <from-obj, have> have emphasis and are not
blocked and the <source, have> has emphasis and is blocked, then
action process, subtype dispositive-material-action, must be chosen
(the relevant information is highlighted):^5




A representation of a situation type, such as change-of-possession,
plus emphasis information thus makes it possible to constrain choice
and, in a number of cases, even determine choice of a concept in the
Upper Model.

^4 By application of a notion similar to Jackendoff's
[Jackendoff, 1990, p. 26] 'semantic field feature'.

^5 This situation corresponds to examples (1) and (2) in Figure 1.
Note also that in the case of the <agens, act> not being blocked, it
will be coreferential with the <source, have>.
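
The selection rule stated in Example 1 can be written as a predicate
over the emphasis and blocking status of three roles. The Python
sketch below is a hedged paraphrase of that rule only; it is not
KOMET code, and the names `RoleStatus` and `choose_action_subtype` are
hypothetical.

```python
from typing import Dict, NamedTuple, Optional


class RoleStatus(NamedTuple):
    emphasis: bool   # the role belongs to a proposition with emphasis
    blocked: bool    # the role is not verbalized


def choose_action_subtype(roles: Dict[str, RoleStatus]) -> Optional[str]:
    """Return the Upper Model subtype licensed by the rule in the text,
    or None if the rule does not apply."""
    agens = roles["agens"]
    source = roles["source"]
    from_obj = roles["from-obj"]

    def foregrounded(role: RoleStatus) -> bool:
        return role.emphasis and not role.blocked

    # Case 1: agens blocked; source and from-obj emphasized and verbalized.
    first_case = (agens.blocked
                  and foregrounded(source) and foregrounded(from_obj))
    # Case 2: agens and from-obj emphasized and verbalized;
    # source emphasized but blocked.
    second_case = (foregrounded(agens) and foregrounded(from_obj)
                   and source.emphasis and source.blocked)
    if first_case or second_case:
        return "dispositive-material-action"
    return None
```

The first case corresponds to sentence (1) of Figure 1 (Sie verliert
den Schlüssel) and the second to sentence (2) (Sie wirft den Schlüssel
weg).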

Example 2: Emphasis distribution and textual status

For illustration of the control of emphasis distribution by textual
statuses, we have chosen the example of dative shift. Dative shift
is motivated by the perspective on a process and the focus/nonfocus
on a particular participant in that process in a text, i.e., very
broadly, dative shift is textually motivated. More specifically, it
is motivated here by invoking a specific emphasis distribution. We
represent emphasis information as attributed to Upper Model roles of
the SPL representation of a clause in terms of inquiries.^6 Only
with the emphasis information can we distinguish between a
dative-shifted and a non-dative-shifted grammatical realization of
the ideational part of the SPL:^7

Sample SPL 1: Er schickt ihm eine Einladung. (He sends him an invitation.)

    (send / directed-action
          :actor (he / person)
          :recipient (him / person
                          :emphasis-q emphatic)
          :actee (invitation / object))

Sample SPL 2: Er schickt eine Einladung an ihn. (He sends an invitation to him.)

    (send / directed-action
          :actor (he / person)
          :recipient (him / person
                          :emphasis-q nonemphatic)
          :actee (invitation / object))

In sample SPL 1, the recipient is verbalized with emphasis (by
dative case) and the actee is in the focus position. In sample
SPL 2, on the other hand, it is the recipient that is in focus
position. Here, it does not have emphasis (and is thus assigned
oblique case).

^6 For a description of the mechanism of the chooser/inquiry
interface between semantics and grammar in Penman-type generation
architectures, of which KOMET is an example, see [Mann, 1983].

^7 These correspond to examples (3) and (4) in Figure 1.
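
The effect of the emphasis annotation on the two sample SPLs can be
mimicked with a toy realizer. This Python sketch is purely
illustrative: it does not reproduce the chooser/inquiry mechanism or
the KOMET grammar, the dictionary encoding of the SPL is an assumption
made here, and `LEXICON` and `realize` are hypothetical names covering
only the words of the two samples.

```python
# Tiny hand-written lexicon for the two sample sentences only.
LEXICON = {
    "he": {"nominative": "er"},
    "him": {"dative": "ihm", "oblique": "an ihn"},
    "invitation": {"accusative": "eine Einladung"},
    "send": "schickt",
}


def realize(spl):
    """Linearize a directed-action SPL, given as a dict, into German."""
    recipient = spl["recipient"]
    subject = LEXICON[spl["actor"]["lex"]]["nominative"]
    verb = LEXICON[spl["process"]]
    actee = LEXICON[spl["actee"]["lex"]]["accusative"]
    if recipient.get("emphasis-q") == "emphatic":
        # Recipient with emphasis: dative case, actee in focus position.
        words = [subject, verb, LEXICON[recipient["lex"]]["dative"], actee]
    else:
        # Recipient without emphasis: prepositional object in focus position.
        words = [subject, verb, actee, LEXICON[recipient["lex"]]["oblique"]]
    sentence = " ".join(words)
    return sentence[0].upper() + sentence[1:] + "."


SPL_1 = {"process": "send",
         "actor": {"lex": "he"},
         "recipient": {"lex": "him", "emphasis-q": "emphatic"},
         "actee": {"lex": "invitation"}}
SPL_2 = {"process": "send",
         "actor": {"lex": "he"},
         "recipient": {"lex": "him", "emphasis-q": "nonemphatic"},
         "actee": {"lex": "invitation"}}
```

With these inputs, `realize(SPL_1)` gives "Er schickt ihm eine
Einladung." and `realize(SPL_2)` gives "Er schickt eine Einladung an
ihn."; the ideational content of the two inputs is identical.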

In our current architecture [Bateman et al., 1993], this kind of
textual variation is represented in a local-level discourse
semantics that mediates information between the global-level
discourse organization (represented as stages in a generic structure
potential (GSP; cf. [Hasan, 1978]) and rhetorical structures (RST;
cf. [Mann and Thompson, 1985])) and the grammar. The local-level
discourse semantics (based on [Martin, 1992]) contains textual
linguistic information that controls the textually-relevant options
in the grammar, such as topic and focus selection (cf.
[Sgall et al., 1986]), reference and information structure. Given a
representation of propositional content, the text planner keeps
track of textual decisions in thematic development and reference
attribution and selects from the textually-relevant emphasis
potential the option that is appropriate in a given context.
Consider a piece of text as it typically occurs in the domain we
deal with in [Teich et al., 1994] (arts and artists' biographies)
that provides a context for the choice of emphasis distribution in
sample SPL 1:

(1) Seit 1898 beschäftigte sich Behrens mit den Gestaltungsproblemen
von Industrieprodukten. (2) Er entwarf unter anderem Flaschen für
die Serienherstellung in einer grossen Glasfabrik. (3) 1899 schickte
der Grossherzog von Hessen ihm eine Einladung nach Darmstadt zu
kommen und sich einer Gruppe junger Künstler anzuschliessen ...^8

Typically, in biography texts, the artist the text deals with acts
as the hypertheme of the text. Moreover, one of the typical thematic
developments by which a biography text proceeds is selecting
temporal locations (as in (1)) or reselecting the hypertheme as
theme of the next sentence (as in (2)). This textual organization is
accordingly produced by the text planner. Then, given this textual
status, all references to the participant constituting the
hypertheme (here: Behrens) belong to information already introduced.
Being the hypertheme and given information is reflected in the
participant receiving emphasis status. Grammatically, this is
realized in the assignment of non-oblique case to the participant
and in its ordering in the clause. In sentence (3), the recipient
role thus receives emphasis status and cannot appear in the focus
position, which is generally reserved for pieces of information that
are new in the discourse and not thematic (cf. *Er schickte eine
Einladung ihm). The problematic gap between the high-level textual
organization and grammatical expression is thus appropriately
bridged.

^8 English: In 1898 Behrens turned to problems of industrial
production and designed a number of prototype flasks for mass
production by a large glass works. In 1899 the Grand Duke of Hessen
sent him an invitation to come to Darmstadt and join a group of
young artists...
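
The step from textual status to emphasis status described here can be
sketched as a single decision. The Python fragment below is a
simplified assumption about how a text planner might set the
:emphasis-q annotation of a recipient; the function name
`recipient_emphasis` and its arguments are hypothetical and do not
describe the KOMET planner.

```python
def recipient_emphasis(referent, hypertheme, given):
    """Choose the :emphasis-q value for a recipient.

    referent   -- the participant filling the recipient role
    hypertheme -- the participant the whole text is about
    given      -- set of participants already introduced in the text
    """
    # A recipient that is both the hypertheme and given information
    # receives emphasis status (non-oblique case, no focus position).
    if referent == hypertheme and referent in given:
        return "emphatic"
    return "nonemphatic"
```

For sentence (3) of the biography text, the recipient is Behrens, who
is the hypertheme and has already been introduced, so the sketch
yields "emphatic", the value used in sample SPL 1.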

5 CONCLUSIONS, SIGNIFICANCE AND FUTURE WORK

We have shown that emphasis information can provide more control of
choices in generation on the higher strata of the linguistic system
(semantics). Ideationally, emphasis distribution and blocking of
roles constrain the possible process types of the Upper Model to be
chosen. In the grammar, this consequently constrains choice in case
assignment. We have also sketched the aspect of emphasis theory that
is relevant for textual decisions in thematicity and information
structure, which attribute certain textual statuses to the
participants in the discourse. These are reflected grammatically,
for example, in relation-changing phenomena such as dative shift. In
a current application of NL analysis and generation components to
the domain of arts and artists' biographies [Rostek et al., 1994],
the mechanisms described above in the discussion of example 1
provide one component of a domain model that is used both by
analysis and generation [Teich et al., 1994].

Some next steps for this work are clear. 
Many of the examples put forward by Kunze 
are argued in terms of textual acceptabil- 
ity that goes beyond single clauses. We are 
now, therefore, investigating the relationship 
of emphasis information and textual statuses 
we have sketched in the discussion of exam- 
ple 2 more closely, also considering other 
fields, such as creation, change-of-location
and verba dicendi.

A further step is to investigate the multilingual applicability of
the framework. For example, in [Kunze, 1992], Kunze proposes an
analogous treatment for the field of change-of-possession in
English. It will be interesting to investigate the applicability to
English of a detailed account that has been worked out on the basis
of a language other than English: the reverse of what normally
occurs! Semantic emphasis may support an improved interface between
textual organization and grammatical decisions for English also,
although, at least in a systemic-functional account, somewhat more
functionally differentiated proposals have been made for the
phenomena that Kunze gathers together under semantic emphasis (for
example, given-new information, theme-rheme information, and modal
responsibility of the grammatical subject, all of which are
independently variable; cf. [Martin, 1992]). The precise
relationship of semantic emphasis to these needs to be clarified.